Parameter estimation & hypothesis testing
University of Amsterdam
2025-11-14
In this lecture we aim to:
In each model, a prior probability distribution is assigned to the parameter \(\theta\). For a proportion, a convenient choice is the Beta distribution, because its domain is also \([0, 1]\).
\(\begin{aligned} k &= 8 \\ n &= 10 \end{aligned}\)
How likely is 8 out of 10 for all possible \(\theta\) values?
\[\binom{n}{k} \theta^k (1-\theta)^{n-k}\]
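The binomial likelihood above can be evaluated directly for any candidate \(\theta\). A minimal sketch in Python (the three \(\theta\) values shown are illustrative):

```python
# Evaluate the binomial likelihood of k = 8 successes in n = 10 trials
# for a few candidate values of theta (the chosen values are illustrative).
from math import comb

k, n = 8, 10

def likelihood(theta):
    # P(data | theta) = C(n, k) * theta^k * (1 - theta)^(n - k)
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

for theta in (0.2, 0.5, 0.8):
    print(f"theta = {theta}: likelihood = {likelihood(theta):.4f}")
```

As expected, the likelihood peaks at \(\theta = k/n = 0.8\), the observed proportion.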
How likely is 8 out of 10, averaged over all \(\theta\) values in the model? This average is the marginal likelihood.
Now we can update Alex’s belief about the possible values of \(\theta\) based on the data (the likelihood function): values with a likelihood higher than the average likelihood receive a boost in plausibility, and the others receive a penalty.
The posterior distribution is a probability distribution, so we can make probabilistic statements based on it:
In the Bayesian framework, this is known as the credible interval. This interval entails taking the middle \(x\)% of the posterior distribution. For instance, we can take a 95% credible interval, which ranges from \(0.482\) to \(0.940\) in this case. This means that under Alex’s model, there is a 95% probability that the true value of \(\theta\) is between \(0.482\) and \(0.940\).
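The interval above can be reproduced with a simple grid approximation. The sketch below assumes a uniform Beta(1, 1) prior (an assumption on our part, though it yields the interval quoted in the text): the posterior for 8 successes in 10 trials is then Beta(9, 3), and the 95% central credible interval is read off from the cumulative posterior mass.

```python
# Grid approximation of the posterior for theta, assuming a uniform
# Beta(1, 1) prior (this assumption reproduces the interval in the text).
from math import comb

k, n = 8, 10
grid = [i / 10000 for i in range(10001)]  # candidate theta values in [0, 1]
prior = [1.0] * len(grid)                 # uniform prior
lik = [comb(n, k) * t**k * (1 - t)**(n - k) for t in grid]
unnorm = [p * l for p, l in zip(prior, lik)]
total = sum(unnorm)
post = [u / total for u in unnorm]        # normalised posterior mass

def quantile(q):
    # Smallest theta whose cumulative posterior mass reaches q
    cum = 0.0
    for t, p in zip(grid, post):
        cum += p
        if cum >= q:
            return t

lo, hi = quantile(0.025), quantile(0.975)  # 95% central credible interval
print(round(lo, 3), round(hi, 3))
```

Running this recovers bounds close to 0.482 and 0.940, matching the interval stated above.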
\[\large \underbrace{\frac{P(H_A \mid data)}{P(H_0 \mid data)}}_\textrm{Posterior odds} = \underbrace{\frac{P(H_A)}{P(H_0)}}_\textrm{Prior odds} \times \underbrace{\frac{P(data \mid H_A)}{P(data \mid H_0)}}_\textrm{Bayes Factor}\]
\[\underbrace{\frac{P(data \mid H_A)}{P(data \mid H_0)}}_\textrm{Bayes Factor}\] The ratio of the marginal likelihoods of the data under the alternative and the null models.
A Bayes factor of \({BF}_{10} = 3\) means that the data are 3 times more likely under the alternative model than under the null model.
Heuristics for the Interpretation of the Bayes Factor by Harold Jeffreys
| BF | Evidence |
|---|---|
| 1 – 3 | Anecdotal |
| 3 – 10 | Moderate |
| 10 – 30 | Strong |
| 30 – 100 | Very strong |
| >100 | Extreme |
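To make the Bayes factor concrete, here is a worked sketch for the 8-out-of-10 example. The two hypotheses below are illustrative assumptions, not necessarily Alex's models: a point null \(H_0: \theta = 0.5\) against \(H_1: \theta \sim \text{Beta}(1, 1)\) (uniform). The marginal likelihood under \(H_1\) is the likelihood averaged over the prior, approximated numerically on a grid.

```python
# Bayes factor for 8 successes in 10 trials, assuming (illustratively)
# H0: theta = 0.5 versus H1: theta ~ Beta(1, 1), i.e. a uniform prior.
from math import comb

k, n = 8, 10

# Marginal likelihood under H0: the binomial probability at theta = 0.5
m0 = comb(n, k) * 0.5**n

# Marginal likelihood under H1: average the likelihood over the uniform
# prior via numerical integration on a grid
steps = 10000
m1 = sum(comb(n, k) * (i / steps)**k * (1 - i / steps)**(n - k)
         for i in range(steps + 1)) / (steps + 1)

bf10 = m1 / m0
print(round(bf10, 2))  # roughly 2.07
```

Under these assumptions \({BF}_{10} \approx 2.07\): on Jeffreys' scale, merely anecdotal evidence for the alternative.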




Show model predictions using dummy variables:
\[ y_i = b_0 + b_1 \times x_i, \] where \(x_i\) indicates whether observation \(i\) is in group A or group B. The coefficient \(b_1\) then captures the group difference (if \(b_1 = 0\), there is no group difference).
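The dummy-variable coding can be illustrated with ordinary least squares on made-up data (the group values below are assumptions for illustration). With a single 0/1 dummy, the OLS fit reduces to the two group means:

```python
# Minimal sketch of the dummy-variable model y_i = b0 + b1 * x_i, fitted
# by ordinary least squares on made-up data (values are illustrative).
group = [0, 0, 0, 0, 1, 1, 1, 1]  # x_i: 0 = group A, 1 = group B
y = [5.1, 4.9, 5.0, 5.2, 6.9, 7.1, 7.0, 6.8]

mean_a = sum(yi for xi, yi in zip(group, y) if xi == 0) / group.count(0)
mean_b = sum(yi for xi, yi in zip(group, y) if xi == 1) / group.count(1)

# With one 0/1 dummy, OLS gives:
b0 = mean_a           # intercept = mean of group A
b1 = mean_b - mean_a  # slope = difference between group means
print(b0, b1)
```

Here \(b_1\) is exactly the difference between the two group means, which is the quantity the hypothesis test below is about.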
\[ y_i = b_0 + b_1 \times x_i \]
\[\mathcal{M_0}: b_1 = 0 \] \[\mathcal{M_1}: b_1 \sim \text{Cauchy}(0, 0.707)\]
How well did each model predict the data?
\(\rightarrow\) Marginal likelihood
\[\frac{P(data \mid \mathcal{M}_1)}{P(data \mid \mathcal{M}_0)} \approx 10,000\] The data are 10,000 times more likely under \(\mathcal{M}_1\) than under \(\mathcal{M}_0\)
\[ tastiness = b_0 + b_1 \times alcoholic + b_2 \times correct\]







Scientific & Statistical Reasoning